Structural framework for combining speaker recognition methods
Abstract
This paper describes a structural framework for designing a speaker recognition system based on multiple models. The models are combined not only at the recognition level but also through joint training. This unified training relies on a common structure: a decomposition tree of the data from the normalization speakers. For the experiments, we selected two models, the Gaussian Mixture Model and the Auto-Regressive Vectorial Model, to test the structural framework for combining speaker verification scores. The approach was tested on a subset of the 30”-NIST’97 Speaker Recognition Evaluation corpus. The list of files in this subset (i.e., normalization, training and test) can be found at http://www-apa.lip6.fr/PAROLE/ICSLP2000/.
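To make the idea of combining verification scores concrete, here is a minimal sketch of score-level fusion for two models. It is not the method of the paper: the abstract does not give the fusion rule, and the decomposition tree is replaced here by a flat set of normalization-speaker scores. The function names, the equal default weight, and the zero threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def cohort_normalize(raw_score, cohort_scores):
    """Center and scale a raw score using scores obtained on normalization speakers."""
    mu = mean(cohort_scores)
    sigma = stdev(cohort_scores)  # sample standard deviation, needs >= 2 scores
    if sigma == 0:
        return 0.0  # degenerate cohort: no spread to normalize against
    return (raw_score - mu) / sigma


def fuse_scores(gmm_score, arvm_score, gmm_cohort, arvm_cohort, weight=0.5):
    """Weighted sum of the two normalized scores (weight is a hypothetical parameter)."""
    gmm_norm = cohort_normalize(gmm_score, gmm_cohort)
    arvm_norm = cohort_normalize(arvm_score, arvm_cohort)
    return weight * gmm_norm + (1.0 - weight) * arvm_norm


def accept(fused_score, threshold=0.0):
    """Accept the claimed identity when the fused score exceeds the threshold."""
    return fused_score > threshold
```

Normalizing each score against its own cohort before summing puts the two models on a comparable scale, which is why a single weight and a single threshold are enough in this sketch.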
Related resources
Methods of Combining Multiple Classifiers with Different Features and Their Applications to Text-Independent Speaker Identification
In practical applications of pattern recognition, different features are often extracted from the raw data to be recognized. Combining multiple classifiers that use different features is a general problem across many application areas of pattern recognition. In this paper, a systematic investigation has been made and possible solutions are classified into three framewo...
Combining EigenVoices and structural MLLR for speaker adaptation
This paper considers the problem of speaker adaptation of acoustic models in speech recognition. We have investigated four possible methods that integrate the concepts of Structural Maximum Likelihood Linear Regression (SMLLR) and the EigenVoices technique to adapt the Gaussian means of the speaker-independent models to a new speaker. The experiments were evaluated using the speech...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Combining Speaker and Speech Re
This paper presents a general framework for the integration of speaker and speech recognizers. The framework poses the problem of combining speech and speaker recognizers as the joint maximization of the a posteriori probability of the word sequence and speaker given the observed utterance. It is shown that the a posteriori probability can be expressed as the product of four terms: a likelihood s...